AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.78)

Neural Information Processing SystemsMar-22-2026, 22:52:44 GMT

MatFormer: Nested Transformer for Elastic Inference

Foundation models are applied in a broad spectrum of settings with different inference constraints, from massive multi-accelerator clusters to resource-constrained standalone mobile devices. However, the substantial costs associated with training these models often limit the number of unique model sizes that can be offered. Consequently, practitioners are compelled to select a model that may not be optimally aligned with their specific latency and cost requirements. We present MatFormer, a novel Transformer architecture designed to provide elastic inference across diverse deployment constraints. MatFormer achieves this by incorporating a nested Feed Forward Network (FFN) block structure within a standard Transformer model.

artificial intelligence, machine learning, proceedings, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.59)

Neural Information Processing SystemsFeb-12-2026, 04:03:27 GMT

59404fb89d6194641c69ae99ecdf8f6d-Paper-Conference.pdf

large language model, machine learning, natural language, (18 more...)

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > Ohio (0.04)
North America > United States > Maryland > Baltimore County (0.04)
(3 more...)

Genre: Research Report > New Finding (0.68)

Industry:

Law (0.68)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsFeb-11-2026, 05:13:30 GMT

fb2697869f56484404c8ceee2985b01d-AuthorFeedback.pdf

cifar100, fusion, individual model, (17 more...)

Genre: Research Report > New Finding (0.35)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.50)

MIT Technology ReviewJan-7-2026, 11:23:47 GMT

LLMs contain a LOT of parameters. But what's a parameter?

LLMs contain a LOT of parameters. They're the mysterious numbers that make your favorite AI models tick. What are they and what do they do? I am writing this because one of my editors woke up in the middle of the night and scribbled on a bedside notepad: "What is a parameter?" Unlike a lot of thoughts that hit at 4 a.m., it's a really good question--one that goes right to the heart of how large language models work. A large language model's parameters are often said to be the dials and levers that control how it behaves.

dimension, llm, neuron, (14 more...)

MIT Technology Review

Country:

North America > United States > Massachusetts (0.04)
Asia > China (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceNov-25-2025

ContrastScore: Towards Higher Quality, Less Biased, More Efficient Evaluation Metrics with Contrastive Evaluation

Wang, Xiao, Larionov, Daniil, Wu, Siwei, Liu, Yiqi, Eger, Steffen, Moosavi, Nafise Sadat, Lin, Chenghua

Evaluating the quality of generated text automatically remains a significant challenge. Conventional reference-based metrics have been shown to exhibit relatively weak correlation with human evaluations. Recent research advocates the use of large language models (LLMs) as source-based metrics for natural language generation (NLG) assessment. While promising, LLM-based metrics, particularly those using smaller models, still fall short in aligning with human judgments. In this work, we introduce ContrastScore, a contrastive evaluation metric designed to enable higher-quality, less biased, and more efficient assessment of generated text. We evaluate ContrastScore on two NLG tasks: machine translation and summarization. Experimental results show that ContrastScore consistently achieves stronger correlation with human judgments than both single-model and ensemble-based baselines. Notably, ContrastScore based on Qwen 3B and 0.5B even outperforms Qwen 7B, despite having only half as many parameters, demonstrating its efficiency. Furthermore, it effectively mitigates common evaluation biases such as length and likelihood preferences, resulting in more robust automatic evaluation.

computational linguistic, large language model, machine learning, (17 more...)

2504.02106

Country:

Asia (1.00)
North America > United States (0.68)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

arXiv.org Artificial IntelligenceNov-12-2025

LLM Optimization Unlocks Real-Time Pairwise Reranking

Wu, Jingyu, Shrivastava, Aditya, Zhu, Jing, Samuel, Alfy, Kumar, Anoop, Liu, Daben

Efficiently reranking documents retrieved from information retrieval (IR) pipelines to enhance overall quality of Retrieval-Augmented Generation (RAG) system remains an important yet challenging problem. Recent studies have highlighted the importance of Large Language Models (LLMs) in reranking tasks. In particular, Pairwise Reranking Prompting (PRP) has emerged as a promising plug-and-play approach due to its usability and effectiveness. However, the inherent complexity of the algorithm, coupled with the high computational demands and latency incurred due to LLMs, raises concerns about its feasibility in real-time applications. To address these challenges, this paper presents a focused study on pairwise reranking, demonstrating that carefully applied optimization methods can significantly mitigate these issues. By implementing these methods, we achieve a remarkable latency reduction of up to 166 times, from 61.36 seconds to 0.37 seconds per query, with an insignificant drop in performance measured by Recall@k. Our study highlights the importance of design choices that were previously overlooked, such as using smaller models, limiting the reranked set, using lower precision, reducing positional bias with one-directional order inference, and restricting output tokens. These optimizations make LLM-based reranking substantially more efficient and feasible for latency-sensitive, real-world deployments.

large language model, machine learning, natural language, (17 more...)

2511.07555

Country:

Asia (0.68)
North America > Mexico (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

arXiv.org Artificial IntelligenceNov-4-2025

Enhancing Reasoning Abilities of Small LLMs with Cognitive Alignment

Cai, Wenrui, Wang, Chengyu, Yan, Junbing, Huang, Jun, Fang, Xiangzhong

The reasoning capabilities of large reasoning models (LRMs), such as OpenAI's o1 and DeepSeek-R1, have seen substantial advancements through deep thinking. However, these enhancements come with significant resource demands, underscoring the need for training effective small reasoning models. A critical challenge is that small models possess different reasoning capacities and cognitive trajectories compared with their larger counterparts. Hence, directly distilling chain-of-thought (CoT) rationales from large LRMs to smaller ones can sometimes be ineffective and often requires a substantial amount of annotated data. In this paper, we first introduce a novel Critique-Rethink-Verify (CRV) system, designed for training smaller yet powerful LRMs. Our CRV system consists of multiple LLM agents, each specializing in unique tasks: (i) critiquing the CoT rationales according to the cognitive capabilities of smaller models, (ii) rethinking and refining these CoTs based on the critiques, and (iii) verifying the correctness of the refined results. Building on the CRV system, we further propose the Cognitive Preference Optimization (CogPO) algorithm to continuously enhance the reasoning abilities of smaller models by aligning their reasoning processes with their cognitive capacities. Comprehensive evaluations on challenging reasoning benchmarks demonstrate the efficacy of our CRV+CogPO framework, which outperforms other methods by a large margin.

large language model, machine learning, reasoning process, (19 more...)

2504.09802

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)

Pisarevskaya, Dina, Zubiaga, Arkaitz

Agent-based Automated Claim Matching with Instruction-following LLMs

arXiv.org Artificial IntelligenceOct-29-2025

We present a novel agent-based approach for the automated claim matching task with instruction-following LLMs. We propose a two-step pipeline that first generates prompts with LLMs, to then perform claim matching as a binary classification task with LLMs. We demonstrate that LLM-generated prompts can outperform SOTA with human-generated prompts, and that smaller LLMs can do as well as larger ones in the generation process, allowing to save computational resources. We also demonstrate the effectiveness of using different LLMs for each step of the pipeline, i.e. using an LLM for prompt generation, and another for claim matching. Our investigation into the prompt generation process in turn reveals insights into the LLMs' understanding of claim matching.

computational linguistic, large language model, natural language, (17 more...)

2510.23924

Country:

North America > United States (0.93)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)

Genre: Overview (0.93)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Jalilifard, Amir, Rocha, Anderson de Rezende, Raimundo, Marcos Medeiros

Reasoning Distillation and Structural Alignment for Improved Code Generation

arXiv.org Artificial IntelligenceOct-21-2025

Effective code generation with language models hinges on two critical factors: accurately understanding the intent of the prompt and generating code that applies algorithmic reasoning to produce correct solutions capable of passing diverse test cases while adhering to the syntax of the target programming language. Unlike other language tasks, code generation requires more than accurate token prediction; it demands comprehension of solution-level and structural relationships rather than merely generating the most likely tokens. very large language model (VLLM) are capable of generating detailed steps toward the correct solution of complex tasks where reasoning is crucial in solving the problem. Such reasoning capabilities may be absent in smaller language models. Therefore, in this work, we distill the reasoning capabilities of a VLLM into a smaller, more efficient model that is faster and cheaper to deploy. Our approach trains the model to emulate the reasoning and problem-solving abilities of the VLLM by learning to identify correct solution pathways and establishing a structural correspondence between problem definitions and potential solutions through a novel method of structure-aware loss optimization. This enables the model to transcend token-level generation and to deeply grasp the overarching structure of solutions for given problems. Experimental results show that our fine-tuned model, developed through a cheap and simple to implement process, significantly outperforms our baseline model in terms of pass@1, average data flow, and average syntax match metrics across the MBPP, MBPP Plus, and HumanEval benchmarks.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2510.17598

Genre:

Research Report > New Finding (0.66)
Research Report > Promising Solution (0.54)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)